ABSTRACT

In this report we describe a security coding system which enables the
user to log onto a system by using his code number $X_i$, which is
immediately transformed into a pseudo code word $Y_i = f(X_i)$ by the
machine. Even though $Y_i$ and f are public knowledge, it is not
possible to log onto the system without knowing X, and the equation
Y = f(X) cannot be solved for X, even if Y is known.


THE  SECURITY  CODE 

Some time in January, 1973 we were asked by Larry Roberts of Arpa to
design a coding system to prevent unauthorized use of machines on the
Arpa network. The problem, somewhat loosely phrased, was to find a
function f such that the user's password X would be transformed into Y,
where Y = f(X), so that X would not be stored in the machine and would
therefore not be accessible to would-be crackers of the system.

I suggested a method, which was immediately rejected by Larry Roberts
on the grounds that the cracker could solve for X, given Y. It was only
then that the actual problem became clear: it must be impossible to
invert f, since f will be known to the cracker if the system has no
read protection. I then suggested, and we later implemented, a method
using an f that could not practically be inverted.

Here  is  the  description  of  the  first  method  which  I  sent  to  Larry: 

We think of the $x_j$ as being code words for $1 \le j \le n$, and we
shall think of the $x_j$ as being integers between 1 and M. Typically
we might expect n, the number of users, to be about 1000 and M to be
about $10^{40}$. The $Y_{ij}$ we think of as the time-varying code for
$x_j$ at time $t_i$. A suggested time interval $t_i - t_{i-1}$ is one
day.

Each $Y_{ij} = f_i(x_j)$, where $f_i(x)$ is the i-th of our
permutations on {1, ..., M}. We make $f_i(x)$ have the property that
$f_i(x)$ never equals x.

A very convenient family of permutations is the family

    $f_i(x) = k_i x \pmod{P}$

where P is a prime number very close to M (and we may need some
computer time to find it) and $k_i$ is the i-th random integer between
2 and P-1 (it is very easy to produce good random $k_i$). We need an
inverse permutation $g_i(x) = h_i x \pmod{P}$, where $h_i$ is the mod P
inverse of $k_i$ and is found from $k_i$ by Euclid's algorithm at the
time that $k_i$ is generated. The programs which calculate f and g will
be very short; they will involve multi-precision arithmetic, but this
is not too difficult to program. The effect of f and g will be
completely unpredictable even to someone who has constant access to the
$Y_{ij}$ that are stored and to P, because the $k_i$ are produced by a
random number generator.
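
The multiplicative family just described can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the
tiny prime P = 101 stands in for a prime near 10^40, and the names
inverse_mod, new_key, f and g are ours, not those of the original
programs.

```python
import secrets

P = 101  # assumption: a tiny prime standing in for a prime near 10**40


def inverse_mod(k, p):
    """Return h with (k * h) % p == 1, using the extended Euclidean algorithm."""
    old_r, r = k % p, p
    old_s, s = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("k is not invertible modulo p")
    return old_s % p


def new_key(p=P):
    """Draw a random multiplier k in [2, p-1] together with its inverse h."""
    k = 2 + secrets.randbelow(p - 2)
    return k, inverse_mod(k, p)


def f(k, x, p=P):
    """Forward permutation f_i(x) = k_i * x (mod p)."""
    return (k * x) % p


def g(h, y, p=P):
    """Inverse permutation g_i(y) = h_i * y (mod p)."""
    return (h * y) % p


k, h = new_key()
codes = list(range(1, P))                 # every non-zero residue
images = [f(k, x) for x in codes]         # the stored words
round_trip = [g(h, y) for y in images]    # g undoes f
```

Because k is never 1, no non-zero code word is mapped to itself, which
is the fixed-point-free property asked for above.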

How do we use the system? F will be used to check if $X_j$ is the code
of some account, to see how much money is left in account $X_j$, and to
charge account $X_j$ for time used. The function G may be used at the
end of the day (or other time interval) for reading off the account
information into some more permanent record, if this is desired. The
other purpose of G is to recode all of the $Y_{ij}$ when the time
interval changes from $t_i$ to $t_{i+1}$. Actually, we would want the
composition

    $f_{i+1}(g_i(x)) = k_{i+1} h_i x \pmod{P}$

for that purpose, and of course $k_{i+1} h_i$ only has to be computed
once mod P for each time interval.
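
The daily recoding step can be illustrated with toy numbers. The sketch
below assumes a small prime and made-up multipliers and passwords; it
also obtains the inverse from Fermat's little theorem rather than from
Euclid's algorithm, purely to keep the example short.

```python
P = 101                  # assumption: toy prime standing in for a prime near 10**40
k_old, k_new = 7, 12     # hypothetical multipliers for intervals i and i+1

h_old = pow(k_old, P - 2, P)       # inverse of k_old mod P (Fermat's little theorem)
recode = (k_new * h_old) % P       # k_{i+1} * h_i, computed once per interval

passwords = [5, 42, 77]                                 # hypothetical x_j
stored_old = [(k_old * x) % P for x in passwords]       # Y_{ij}
stored_new = [(recode * y) % P for y in stored_old]     # Y_{i+1,j}, passwords never needed
```

Each stored word is updated with a single multiplication, and the
passwords themselves are not consulted during the change of interval.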

We have some ideas about how the $Y_{ij}$ should be stored, but there
is no reason why they shouldn't be stored in the same way that the
$X_j$ are presently stored. It would be especially uncrackable,
however, if the $Y_{ij}$ were stored, along with their account
information, by means of a HASH code and if the account information
such as dollars available were coded by F's and G's also.

Because  of  the  many  details  involved  we  would  rather  write  the 
programs  ourselves  than  send  all  of  the  many  details  to  you. 


This  is  the  reply  that  I  got  from  Larry  Roberts: 

Mon., Feb. 5, 1973
The system you proposed appears to store the remainder of the product
of your password and K(I) mod P as the word to be recognized, Y.
Thus:

Y(J) + N * P = X(J) * K(I)

I must assume for any system that a system cracker has full access to
all info except X. Thus he need only proceed to use a linear diophantine
solution technique on the above equation until he has X. All this can
be done during a momentary penetration of system security. If I am
correct a far better technique is needed, one which no one can crack
even given all the info necessary to check the password, e.g. the system
operator should not be able to determine my password. I also can't see
what added security is added by changing K each day. Any day I break
in K is there along with all the Y's.

Please run any scheme you come up with past several good people to see
if they can break it before considering programming it. Tell me if I
misunderstood or about other ideas.
Larry Roberts

I then sent this note to Larry Roberts:

Date:  Feb. 6, 1973  2:45 p.m.

From:  Purdy

Re:  Security System

You are right, the system we proposed does indeed store the remainder
of the product of the password and K(I) mod P as the word to be
recognized, Y(I,J).

Y(I,J) + N(I,J) * P = X(J) * K(I)

We are actually a little surprised, however, that a system cracker has
full access to all info except X. We had imagined that the number K(I)
could be kept in a reserved part of memory. If the user had access to
K(I), then you are quite correct: the inverse of K(I) could be computed
using Euclid's algorithm, or he could apply Euclid's algorithm directly
to the Diophantine equation.
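
The attack conceded here takes only a few lines. The following is a
sketch, assuming a convenient known prime (2^61 - 1) and made-up values
for K(I) and X(J); the function names are ours.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def recover_password(y, k, p):
    """Solve y + n*p == x*k for x, given the public y, k and p."""
    g, u, _ = extended_gcd(k, p)
    if g != 1:
        raise ValueError("k must be invertible modulo p")
    return (u * y) % p


P = 2**61 - 1               # assumption: a convenient known prime
K = 1234567891011           # hypothetical multiplier K(I)
X = 987654321987654321      # hypothetical secret password X(J), smaller than P
Y = (X * K) % P             # the word the system stores

recovered = recover_password(Y, K, P)   # equals X
```

The loop runs for a number of steps proportional to the number of
digits of P, so a momentary penetration is indeed enough.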

From here on, let us assume therefore that the CRACKER has knowledge
of everything except the password X.

The code will be Y = f(x) and the cracker will have knowledge of f.
Then we arrange that f has no easily computable inverse. For example,
if f(x) = polynomial of degree 6 (mod P) then the inverse g of f, which
of course is 6-valued, will probably not be computable by any
reasonable algorithm. There is one algorithm for computing g which
always exists, namely the cracker tries all possible code words X until
one comes up for which f(X) = Y. There are P possible code words, so
that if P is about 1.0E40 then it would take too long to do this.
However, for this to be true it is essential that the code words be
truly random, and they will probably be hard to remember; there is no
way to avoid this unpleasant aspect of the system if f is public
information. To get around the non-uniqueness of the inverse of f(x),
the code Y for the user code would be the vector (f(x), w) where the
user code is (x, w). The word w only needs to be long enough to
guarantee uniqueness and it is no help to the cracker, e.g. w might be
the account number. (Since there are at least $P/6 \approx 10^{39}$
possible values for f(x), it is very unlikely that coincidences will
occur in any case, so we might just do without w.) Some time should be
spent finding a suitable f.
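
A toy version of this proposal shows both the exhaustive-search inverse
and its non-uniqueness. The sketch assumes a small prime so that the
search finishes instantly, and the coefficients, the user code and the
names f and preimages are all made up for illustration.

```python
P = 10007                          # assumption: small prime so exhaustive search is instant
COEFFS = (1, 3, 0, 7, 2, 5, 11)    # hypothetical degree-6 polynomial, highest power first


def f(x, p=P):
    """Evaluate the polynomial at x modulo p by Horner's rule."""
    acc = 0
    for c in COEFFS:
        acc = (acc * x + c) % p
    return acc


def preimages(y, p=P):
    """The inverse that always exists: try every possible code word."""
    return [x for x in range(p) if f(x, p) == y]


x, w = 4321, 17                    # hypothetical user code (x, w); w could be the account number
stored = (f(x), w)                 # what the machine keeps
candidates = preimages(stored[0])  # between one and six code words, x among them
```

With P near 10^40 the same search would need on the order of P
evaluations of f, which is the point of the argument.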

It   seems   to   us   that  this    system  satisfies  your  requirements,   but 
there   is    always  the  possibility  that  we  have  not   understood  the  problem 
entirely. 

I  then   got  this    reply  from  Larry  Roberts: 

Date:  Feb. 8, 1973  1914
From:  Larry Roberts
Re:  Security System

You now understand the problem. A penetrator easily makes himself
a wheel and then accesses everything, but for a short while. I hope
you can find a function soon since we are pushing for full security
in the NET in the next few months. In some of the systems without
special hardware all system files are readable, but not writeable by
the users. Here such a security system is mandatory.

Don't    ignore   complex  boolean    functions    (nonlinear)    since  these    should 
be  more  profitable. 

Hope  to  hear   from  you  soon. 
Larry  Roberts 


Then  Larry  Roberts   came  to   visit    and  we  thrashed  out   a  few  details 
and  I  eventually  sent  him  the   following: 

In what follows, we describe a security coding system which enables
the user to log onto a system by using his code number $X_i$, which is
immediately transformed into a pseudo code word $Y_i = f(X_i)$ by the
machine. Even though $Y_i$ and f are public knowledge it is not
possible to log onto the system without knowing X, and the equation
Y = f(X) cannot be solved for X, even if Y is known.

§1  The function f(X)

The code is of the form Y = f(X), and the cracker knows f. We have
arranged that the equation f(X) = Y is very unlikely to be solved in
fewer than $10^6$ seconds of processor time. The function f is a
polynomial modulo a prime P,

    $f(X) = X^n + a_4 X^m + a_3 X^3 + a_2 X^2 + a_1 X + a_0 \pmod{P}$,

where $P = 2^{64} - 59$, $m = 2^{24} + 3$, $n = 2^{24} + 17$, and the
$a_i$ are 19-digit numbers. The cracker has essentially two approaches
for solving Y = f(X) given Y. He can use trial and error, or
Berlekamp's and similar algorithms. It was necessary to make n and P
fairly large in order to defeat both approaches.
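
For readers who want to experiment, here is a minimal Python sketch of
a function of this shape. The coefficients below are hypothetical
19-digit placeholders (the report does not list the real ones), and the
names f and check_login are ours. Python integers are arbitrary
precision, so no separate multi-precision routines are needed.

```python
P = 2**64 - 59      # the prime modulus
N = 2**24 + 17      # exponent n of the leading term
M = 2**24 + 3       # exponent m of the second term

# Hypothetical 19-digit coefficients a_0 .. a_4; NOT the values used in the report.
A = (
    1234567890123456789,
    2718281828459045235,
    3141592653589793238,
    5772156649015328606,
    1414213562373095048,
)


def f(x):
    """Evaluate X^n + a_4 X^m + a_3 X^3 + a_2 X^2 + a_1 X + a_0 (mod P)."""
    x %= P
    total = pow(x, N, P)            # fast modular exponentiation
    total += A[4] * pow(x, M, P)
    total += A[3] * pow(x, 3, P)
    total += A[2] * pow(x, 2, P)
    total += A[1] * x
    total += A[0]
    return total % P


def check_login(attempt, stored_y):
    """Accept the attempt if it maps to the stored pseudo code word."""
    return f(attempt) == stored_y


password = 31415926535897932        # hypothetical user code X
stored = f(password)                # only this value is kept by the machine
```

A login then amounts to calling check_login(attempt, stored); the
password itself never has to be stored.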

§2  Berlekamp's Algorithm as a Threat

Berlekamp's method for completely factoring polynomials modulo P can be
applied to the polynomial f(X) - Y = g(X), and it requires [1] at least
$n^3 (\log P)^2$ operations. There are no algorithms known which are
faster than this, so it seems safe to say that $n^2 (\log P)^2$
operations are required to find just a single root.

Now $n \approx 10^7$ and $P \approx 10^{19}$, so more than $10^{16}$
operations are required. Let us say that the speed s of the cracker's
machine is $10^{10}$ operations per second (faster than ILLIAC IV).
Then it would still take more than $T = 10^6$ seconds ≈ two weeks.
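
The order of magnitude can be checked with a few lines of arithmetic.
This sketch rests on two assumptions that are ours: the logarithm is
taken to base 10 (the smallest of the usual choices, hence the most
cautious), and the cracker's speed is 10^10 operations per second.

```python
import math

P = 2**64 - 59
n = 2**24 + 17
s = 10**10      # assumed cracker speed, in operations per second

operations = n**2 * math.log10(P)**2   # n^2 (log P)^2, with log to base 10 (assumption)
seconds = operations / s
days = seconds / 86400
```

With exact rather than rounded values of n and P the estimate comes out
somewhat above the rounded bounds of 10^16 operations and 10^6 seconds,
so the rounded figures err on the side of caution.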

§3  The Trial and Error Threat

We assume that the cracker has a list of all assigned $Y_i$ and he
keeps trying values of X until $f(X) = Y_i$ for some i. Let c be the
number of $X_i$ assigned to users. A theorem of Lagrange guarantees
that no more than n of the X's will map into one Y. Thus, if the
cracker chooses an X at random between 1 and P, his probability of
success is at most $cn/P$.

The probability $P_k$ of failure on the kth trial is at least

    $1 - \frac{cn}{P - k + 1} \approx 1 - \frac{cn}{P} = a$.

The expected number of trials $K_e$ before success is

    $K_e = \sum_{k=1}^{\infty} k (P_1 P_2 \cdots P_k) \ge \sum_{k=1}^{\infty} k a^k = \frac{a}{(1-a)^2} \approx \frac{P^2}{c^2 n^2}$.

The expected cracking time is

    $T_e = \frac{K_e Q}{s} \approx \frac{P^2 Q}{c^2 n^2 s}$,

where Q is the number of operations needed to compute f(X).

Even if Q = 1, and $s = 10^{10}$, c = 1000, we have

    $T_e \approx \frac{10^{38}}{10^6 \cdot 10^{14} \times 10^{10}} = 10^8$

seconds, or about 3 years.
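
The bound cn/P on the chance of a lucky guess can be checked
exhaustively on a toy instance. The sketch below assumes a small prime,
a made-up polynomial of degree n = 6 and c = 5 made-up user codes; it
illustrates Lagrange's bound only, not the timing estimate.

```python
P = 10007                         # assumption: small prime so every X can be tried
COEFFS = (1, 0, 4, 9, 2, 6, 5)    # hypothetical monic polynomial, highest power first
n = len(COEFFS) - 1               # degree, here 6


def f(x, p=P):
    """Evaluate the polynomial at x modulo p by Horner's rule."""
    acc = 0
    for coeff in COEFFS:
        acc = (acc * x + coeff) % p
    return acc


assigned_x = [11, 222, 3333, 4444, 9999]     # hypothetical user codes
assigned_y = {f(x) for x in assigned_x}      # the list the cracker is assumed to hold
c = len(assigned_x)

# Count every X that would be accepted for some assigned Y.
hits = sum(1 for x in range(P) if f(x) in assigned_y)
success_probability = hits / P
bound = c * n / P                            # the cn/P bound from Lagrange's theorem
```

Each assigned Y has at most n preimages, so hits can never exceed c*n,
whatever coefficients are chosen.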

§4  Implementation

The implementation of the algorithm for f uses multi-precision
arithmetic in the form of some Fortran subroutines. It is operational
on a PDP-10 and requires about half a second to compute f(X).


§5  Remarks about the use of the code

When user-codes are assigned, a random number generator should be used
to choose an $X_i$ and then one should verify that $f(X_i) \ne Y_j$ for
any previous $Y_j$. A coincidence is extremely unlikely, but it could
happen.
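
The assignment procedure can be sketched as a short loop. As before,
the coefficients are hypothetical placeholders, and the names f and
assign_code are ours; Python's secrets module is used here as the
random number generator, which is our choice and not something the
report specifies.

```python
import secrets

P = 2**64 - 59
N = 2**24 + 17
M = 2**24 + 3
# Hypothetical coefficients a_0 .. a_4; NOT the values used in the report.
A = (1234567890123456789, 2718281828459045235, 3141592653589793238,
     5772156649015328606, 1414213562373095048)


def f(x):
    """Polynomial X^n + a_4 X^m + a_3 X^3 + a_2 X^2 + a_1 X + a_0 (mod P)."""
    x %= P
    return (pow(x, N, P) + A[4] * pow(x, M, P) + A[3] * pow(x, 3, P)
            + A[2] * pow(x, 2, P) + A[1] * x + A[0]) % P


def assign_code(existing_ys):
    """Draw random codes until one maps to a Y not already in use."""
    while True:
        x = 1 + secrets.randbelow(P - 1)    # uniform in [1, P-1]
        y = f(x)
        if y not in existing_ys:            # reject the rare coincidence
            existing_ys.add(y)
            return x, y


directory = set()                           # the Y values assigned so far
first_x, first_y = assign_code(directory)
second_x, second_y = assign_code(directory)
```

The retry branch will almost never run, but it guarantees that no two
users share a pseudo code word.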

[1]  D. E. Knuth, The Art of Computer Programming, Vol. 2, pp. 381-397,
Addison-Wesley,    1! 